A Printed PAW Image Database of Arabic Language for Document Analysis and Recognition
Authors
Abstract
Similar resources
A Database for Arabic Printed Character Recognition
Electronic Document Management (EDM) technology is being widely adopted because it enables efficient routing and retrieval of documents. Optical Character Recognition (OCR) is an important front end for such technology. Excellent OCR now exists for Latin-based languages, but there are few systems that read Arabic, which limits the penetration of EDM into Arabic-speaking countries. In developing...
Persian Printed Document Analysis and Page Segmentation
This paper presents a hybrid method, combining low-resolution and high-resolution stages, for Persian page segmentation. In the low-resolution stage, a pyramidal image structure is constructed for multiscale analysis, and the document image is segmented into a set of regions. In the high-resolution stage, connected-component analysis segments each region into homogeneous regions and identifyi...
Retrieving Arabic Printed Document: a Survey
This paper surveys some of the literature pertaining to searching and retrieving OCR’ed printed documents with emphasis on Arabic documents. It examines peculiarities of Arabic morphology, orthography, retrieval, word clustering, display, OCR, and error correction. The paper surveys existing evaluation test-beds for retrieval of Arabic OCR texts. Lastly, it concludes with possible directions fo...
Offline printed Arabic character recognition
Segmentation and Recognition of Printed Arabic Characters
Arabic characters differ significantly from other scripts, such as Latin and Chinese, in that they are written cursively in both printed and handwritten forms. The alphabet consists of 28 main characters. However, most of their shapes change according to their position in the word. These shapes, together with some other secondaries, raise the number of classes to 120. Furthermore, some of these c...
Journal
Journal title: Journal of ICT Research and Applications
Year: 2017
ISSN: 2338-5499,2337-5787
DOI: 10.5614/itbj.ict.res.appl.2017.11.2.6